Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling

Authors

  • Anran Wang
  • Jiwen Lu
  • Gang Wang
  • Jianfei Cai
  • Tat-Jen Cham
Abstract

Most existing approaches to RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them heuristically. There have been some attempts to learn features directly from raw RGB-D data, but their performance is not satisfactory. In this paper, we adapt the unsupervised feature learning technique to RGB-D labeling, treating it as a multi-modality learning problem. Our learning framework performs feature learning and feature encoding simultaneously, which significantly boosts performance. By stacking the basic learning structure, higher-level features are derived and combined with lower-level features to better represent RGB-D data. Experimental results on the benchmark NYU depth dataset show that our method achieves competitive performance compared with the state of the art.

Similar resources

Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition

In this paper, we propose a correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition. Unlike most conventional RGB-D object recognition methods, which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with a pair of deep neural networks, so that the sharable and modal-specific ...

Combining Models from Multiple Sources for RGB-D Scene Recognition

Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained with a large RGB dataset and then fine ...

On the Applicability of Unsupervised Feature Learning for Object Recognition in RGB-D Data

We present a feature extraction method for RGB-D data based on k-means clustering that builds on recent work by Coates et al. Using unsupervised learning methods, we are able to automatically learn feature responses that combine all available information (color and depth) into one concise representation. We show that depth information can substantially increase the recognition performance and t...

Unsupervised Feature Learning for RGB-D Image Classification

Motivated by the success of Deep Neural Networks in computer vision, we propose a deep Regularized Reconstruction Independent Component Analysis network (RICA) for RGB-D image classification. In each layer of this network, we include a RICA as the basic building block to determine the relationship between the gray-scale and depth images corresponding to the same object or scene. Implementing co...

Cross-modal Sound Mapping Using Deep Learning

We present a method for automatic feature extraction and cross-modal mapping using deep learning. Our system uses stacked autoencoders to learn a layered feature representation of the data. Feature vectors from two (or more) different domains are mapped to each other, effectively creating a cross-modal mapping. Our system can either run fully unsupervised, or it can use high-level labeling to f...

Publication date: 2014